U.S. Application No.: 09/592,302 Attorney Docket No.: CIS00-2410

IN THE CLAIMS 

1. (Currently Amended) In a server, a method for providing information
suitable for audio output, the method comprising: 

receiving a web page including a first set of information over a 
network based on a request for the first set of information, receiving the 
first set of information further comprising: 

receiving speech information specifying the first set of 
information; 

generating a text request for the first set of information 
based on an acoustic speech recognition (ASR) technique applied 
to the speech information, generating including interpreting at least 
one primitive construct based on the speech information and 
generating at least one additional primitive construct based on a 
request for a user-defined command, and 

submitting the text request over the network; 
accessing a tagged document in response to receiving the first set 
of information, the tagged document defined as an XML filtering 
document, accessing the tagged document further including: 

determining an identity of the request for the first set of information; and

accessing the tagged document based on the identity of the request, wherein the 
identity of the request is based on at least one of an identifier for an 
originator of the request and an identifier for a destination of the request; 
and 

generating a second set of information including subsets of the web 
page suitable for audio output based on the first set of information and the 
tagged document, generating the second set of information suitable for 
audio output further comprising: 

selecting, based on predetermined expected patterns in the
filtering document, at least one portion of the first set of information 
that is suitable for audio output; and 

generating the second set of information based on selecting 
the at least one portion of the first set of information. 
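
Editorial illustration (not part of the claim language): the selecting and generating steps of claim 1 can be sketched as below. This is a minimal sketch under stated assumptions. The claims do not define the filtering document's schema, so the `<filter>` and `<select>` elements, the `pattern` attribute, and the names `FILTER_XML`, `PAGE`, and `select_portions` are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical XML filtering document. Each <select> element carries a
# regular expression (an "expected pattern"); its first capture group is
# the portion of the page considered suitable for audio output.
FILTER_XML = """
<filter>
  <select name="headline" pattern="&lt;h1&gt;(.*?)&lt;/h1&gt;"/>
  <select name="summary" pattern="&lt;p class=&quot;summary&quot;&gt;(.*?)&lt;/p&gt;"/>
</filter>
"""

# Hypothetical received web page (the "first set of information").
PAGE = (
    '<html><body><h1>Weather</h1><img src="map.gif">'
    '<p class="summary">Sunny, 72 degrees.</p></body></html>'
)


def select_portions(web_page: str, filter_xml: str) -> List[str]:
    """Return the portions of web_page matched by the filter's patterns."""
    root = ET.fromstring(filter_xml)
    selected = []
    for rule in root.findall("select"):
        pattern = rule.get("pattern")
        for match in re.finditer(pattern, web_page, flags=re.DOTALL):
            selected.append(match.group(1).strip())
    return selected


# The "second set of information": text-only subsets of the page.
second_set = select_portions(PAGE, FILTER_XML)
```

In this sketch the image is dropped simply because no pattern selects it; a real filtering document could express its expected patterns quite differently.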

2. (Original) The method of claim 1, wherein: 

the step of receiving the first set of information comprises receiving 
a web page based on a Uniform Resource Locator (URL) request for the 
web page; 

the step of accessing the tagged document comprises accessing 
an Extensible Markup Language (XML) document; and 

the step of generating the second set of information comprises 
generating filtered web content suitable for audio output based on the 
web page and the XML document. 

Claims 3-5. (Cancelled). 

6. (Previously Presented) The method of claim 2, wherein the step of
generating the text request comprises applying a case-match technique to
the speech information. 



Claims 7-9. (Cancelled) 

10. (Original) The method of claim 1, wherein the step of generating the
second set of information suitable for audio output comprises: 

generating text data suitable for audio output based on the first set 
of information and the tagged document, and 

generating audio data based on the text data. 

11. (Original) The method of claim 10, wherein the step of generating the text
data suitable for audio output comprises generating at least one response 
and the step of generating the audio data based on the text data 
comprises applying a text-to-speech (TTS) technique to the at least one 
response. 

12. (Original) The method of claim 1, wherein the step of accessing the
tagged document is performed based on the request for the first set of 
information and approximately concurrently with the step of receiving the 
first set of information. 

13. (Original) The method of claim 1, wherein each of the first set of
information, the tagged document, and the second set of information is at 
least one of a Hypertext Markup Language (HTML) page, an Extensible 
Markup Language (XML) page, a Virtual Reality Modeling Language 
(VRML) page, and a Standard Generic Markup Language (SGML) page. 

14. (Currently Amended) A system for providing information suitable for audio 
output, the system comprising: 

a document database configured for storing a plurality of tagged 
documents; and 

a server comprising an executable resource, wherein the 
executable resource: 

receives a web page including a first set of information over a 
network based on a request for the first set of information, the executable 
resource further operable to generate a text request for the first set of 
information based on an acoustic speech recognition (ASR) technique 
applied to the speech information, and submits the text request over the 
network, generating the text request further including: 

receiving speech information specifying the first set of
information;

interpreting at least one primitive construct based on
the speech information; and

generating at least one additional primitive construct based 
on a request for a user-defined command; 

accesses a tagged document defined as an XML filtering 
document from the document database based on receiving the first set of
information by 

determining an identity of the request for the first set of 
information; and 

accessing the tagged document based on the identity of the 
request, wherein the identity of the request is based on at least one 
of an identifier for an originator of the request and an identifier for a 
destination of the request; and 

generates the second set of information including subsets of the 
web page suitable for audio output based on the first set of information 
and the tagged document, such that the executable resource selects, 
based on predetermined expected patterns in the filtering document, at 
least one portion of the first set of information that is suitable for audio 
output, and generates the second set of information based on selecting 
the at least one portion of the first set of information. 

15. (Original) The system of claim 14, wherein the first set of information is a 
web page based on a Uniform Resource Locator (URL) request for the 
web page; the tagged document is an Extensible Markup Language (XML) 
document; and the second set of information is filtered web content 
suitable for audio output based on the web page and the XML document. 

16. (Original) The system of claim 14, wherein the executable resource
receives speech information specifying the first set of information, 
generates a text request for the first set of information based on an 
acoustic speech recognition (ASR) technique applied to the speech 
information, and submits the text request over the network. 

Claims 17-18. (Cancelled). 

19. (Previously Presented) The system of claim 14, wherein the executable 
resource applies a case-match technique to the speech information to 
generate the text request. 

Claims 20-21. (Cancelled). 

22. (Original) The system of claim 14, wherein the executable resource 
selects at least one portion of the first set of information that is suitable for 
audio output, and generates the second set of information based on 
selecting the at least one portion of the first set of information. 

23. (Original) The system of claim 14, wherein the executable resource 
generates text data suitable for audio output based on the first set of 
information and the tagged document, and the executable resource 
generates audio data based on the text data. 

24. (Original) The system of claim 23, wherein the text data comprises at 
least one response, and the executable resource applies a text-to-speech 
(TTS) technique to the at least one response to generate the audio data. 

25. (Original) The system of claim 14, wherein the executable resource, in an
approximately concurrent time frame: 

accesses the tagged document based on the request for the first 
set of information, 

and receives the first set of information. 



26. (Original) The system of claim 14, wherein each of the first set of
information, the tagged document, and the second set of information is at
least one of a Hypertext Markup Language (HTML) page, an Extensible 
Markup Language (XML) page, a Virtual Reality Modeling Language 
(VRML) page, and a Standard Generic Markup Language (SGML) page. 



27. (Currently Amended) A computer program product that includes a 
computer readable medium having instructions stored thereon for 
providing information suitable for audio output, such that the instructions, 
when carried out by a computer, cause the computer to perform the steps 
of: 

receiving a web page including a first set of information over a 
network based on a request for the first set of information, receiving the 
first set of information further comprising: 

receiving speech information specifying the first set of
information;

generating a text request for the first set of information 
based on an acoustic speech recognition (ASR) technique applied 
to the speech information, generating including interpreting at least 
one primitive construct based on the speech information and 
generating at least one additional primitive construct based on a 
request for a user-defined command, and 

submitting the text request over the network; 

accessing a tagged document defined as an XML filtering
document in response to receiving the first set of information, accessing 
the tagged document further including: 

determining an identity of the request for the first set of
information; and

accessing the tagged document based on the identity of the
request, wherein the identity of the request is based on at least one
of an identifier for an originator of the request and an identifier for a
destination of the request; and

generating a second set of information including subsets of the web page 
suitable for audio output based on the first set of information and the 
tagged document, generating the second set of information suitable for 
audio output further comprising: 

selecting, based on predetermined expected patterns in the filtering 
document, at least one portion of the first set of information that is 
suitable for audio output; and 

generating the second set of information based on selecting the at 
least one portion of the first set of information. 

28. (Original) The computer program product of claim 27, wherein: 

the step of receiving the first set of information comprises receiving 
a web page based on a Uniform Resource Locator (URL) request for the 
web page; 

the step of accessing the tagged document comprises accessing 
an Extensible Markup Language (XML) document; and 

the step of generating the second set of information comprises 
generating filtered web content suitable for audio output based on the
web page and the XML document. 

29. (Currently Amended) A computer program propagated signal product
embodied in a propagated medium, having instructions for providing 
information suitable for audio output, such that the instructions, when 
carried out by a computer, cause the computer to perform the steps of: 

receiving a web page including a first set of information over a 
network based on a request for the first set of information, receiving the 
first set of information further comprising: 

receiving speech information specifying the first set of 
information; 

generating a text request for the first set of information 
based on an acoustic speech recognition (ASR) technique applied 
to the speech information, generating including interpreting at least 
one primitive construct based on the speech information and 
generating at least one additional primitive construct based on a 
request for a user-defined command, and 

submitting the text request over the network; 
accessing a tagged document defined as an XML filtering 
document in response to receiving the first set of information, accessing 
the tagged document further including: 

determining an identity of the request for the first set of 
information; and 

accessing the tagged document based on the identity of the 
request, wherein the identity of the request is based on at least one 
of an identifier for an originator of the request and an identifier for a 
destination of the request; and 

generating a second set of information including subsets of the web page 
suitable for audio output based on the first set of information and the 
tagged document, generating the second set of information suitable for 
audio output further comprising: 

selecting, based on predetermined expected patterns in the filtering
document, at least one portion of the first set of information that is 
suitable for audio output; and 

generating the second set of information based on selecting the at 
least one portion of the first set of information. 

30. (Original) The computer program propagated signal product of claim 29, 
wherein: 

the step of receiving the first set of information comprises receiving 
a web page based on a Uniform Resource Locator (URL) request for the 
web page; 

the step of accessing the tagged document comprises accessing 
an Extensible Markup Language (XML) document; and 

the step of generating the second set of information comprises 
generating filtered web content suitable for audio output based on the 
web page and the XML document. 

31. (Currently Amended) A system for providing information suitable for audio
output, the system comprising: 

a document database configured for storing a plurality of tagged 
document pages; 

means for producing a second set of information suitable for audio 
output, wherein the producing means receives a web page including a 
first set of information over a network based on a request for the first set of 
information, receiving the first set of information further comprising: 

receiving speech information specifying the first set of
information;

generating a text request for the first set of information 
based on an acoustic speech recognition (ASR) technique applied 
to the speech information, generating including interpreting at least 
one primitive construct based on the speech information and
generating at least one additional primitive construct based on a
request for a user-defined command, and

submitting the text request over the network; 

accesses a tagged document defined as an XML filtering document 
from the document database based on receiving the first set of information 
by: 

determining an identity of the request for the first set of
information; and

accessing the tagged document based on the identity of the
request, wherein the identity of the request is based on at least one
of an identifier for an originator of the request and an identifier for a
destination of the request; and

generating the second set of information including subsets of the web page 
suitable for audio output based on the first set of information and the 
tagged document, generating the second set of information suitable for 
audio output comprises: 

selecting, based on predetermined expected patterns in the filtering 
document, at least one portion of the first set of information that is 
suitable for audio output; and 

generating the second set of information based on selecting the at 
least one portion of the first set of information. 

32. (Original) The system of claim 31, wherein the first set of information is a
web page based on a Uniform Resource Locator (URL) request for the 
web page; the tagged document is an Extensible Markup Language (XML) 
document; and the second set of information is filtered web content 
suitable for audio output based on the web page and the XML document. 

33. (Currently Amended) A method for navigating a web by voice in a server
configured for executing voice web applications, the method comprising: 
requesting a web page including a first set of information based on 
a voice web navigation request, requesting the web page further 
comprising: 

receiving speech information specifying the first set of 
information; 

generating a text request for the first set of information 
based on an acoustic speech recognition (ASR) technique applied 
to the speech information, generating including interpreting at least 
one primitive construct based on the speech information and 
generating at least one additional primitive construct based on a 
request for a user-defined command, and 

submitting the text request over the network; 
receiving a retrieved web page based on the voice web navigation 
request; 

accessing a tagged document defined as an XML filtering
document in response to receiving the
retrieved web page, accessing the tagged document further including: 
determining an identity of the request for the first set of 
information; and 

accessing the tagged document based on the identity of the 
request, wherein the identity of the request is based on at least one 
of an identifier for an originator of the request and an identifier for a 
destination of the request; 
generating filtered web content including subsets of the web page suitable for 
audio output based on the retrieved web page and the extensible markup 
language page; and 

generating the at least one audio output file based on the filtered 
web content, generating audio output file further comprising: 

selecting, based on predetermined expected patterns in the filtering
document, at least one portion of the retrieved web page that is suitable 
for audio output; and 

generating the audio output file based on selecting the at least one 
portion of the first set of information. 

34. (Original) The method of claim 33, wherein the step of requesting the web 
page based on the voice web navigation request comprises the steps of: 

receiving speech information specifying the web page; 

generating a text request for the web page based on an acoustic 
speech recognition (ASR) technique applied to the speech information, 
and 

submitting the text request over the network. 

35. (Original) The method of claim 33, wherein the step of accessing the 
extensible markup language document in response to receiving the 
retrieved web page comprises: 

determining an identity of the voice web navigation request for the 
web page, and 

accessing the extensible markup language page based on the 
identity of the voice web navigation request. 

36. (Original) The method of claim 35, wherein the identity of the request is 
based on at least one of an identifier for an originator of the voice web 
navigation request and an identifier for a destination of the voice web 
navigation request. 
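
Editorial illustration (not part of the claim language): claims 35 and 36 describe locating the extensible markup language page from the identity of the request, where that identity draws on an originator identifier, a destination identifier, or both. A minimal sketch follows, assuming a simple in-memory lookup table. The table `FILTER_DOCUMENTS`, its keys and file names, and the function `find_filter_document` are hypothetical stand-ins for the document database, and the fallback order is an assumption.

```python
from typing import Optional

# Hypothetical stand-in for the document database. Keys are
# (originator identifier, destination identifier); None is a wildcard.
FILTER_DOCUMENTS = {
    ("user-42", "news.example.com"): "news_for_user42.xml",
    (None, "news.example.com"): "news_default.xml",
}


def find_filter_document(
    originator: Optional[str], destination: Optional[str]
) -> Optional[str]:
    """Look up a filtering document by request identity.

    Tries the most specific identity first (originator and destination),
    then destination only, then originator only.
    """
    candidates = (
        (originator, destination),
        (None, destination),
        (originator, None),
    )
    for key in candidates:
        if key in FILTER_DOCUMENTS:
            return FILTER_DOCUMENTS[key]
    return None
```

For example, `find_filter_document("user-7", "news.example.com")` falls back to the destination-only entry in this sketch.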

37. (Original) The method of claim 33, wherein the step of generating the 
filtered web content suitable for audio output comprises: 

generating text data suitable for audio output based on the
retrieved web page and the extensible markup language document, and 
generating audio data based on the text data. 



38. (Previously Presented) The method of claim 1 wherein the method of 
accessing a tagged document comprises accessing a plurality of tagged 
documents, the plurality of tagged documents to define user interface 
logistics and to operate the server; and, 

wherein the method of generating a second set of information 
comprises generating a second set of information suitable for audio input 
based on the first set of information and the plurality of tagged documents. 

39. (Previously Presented) The method of claim 38 wherein the plurality of 
tagged documents includes at least one menu document, at least one 
activity document, at least one decision document and at least one 
application state document. 



40. (Previously Presented) The method of claim 38 wherein the plurality of 
tagged documents includes at least one filtering document to be applied to 
the first set of information to generate the second set of information 
suitable for audio output. 

41. (Previously Presented) The method of claim 1 wherein the step of 
generating the second set of information further comprises the step of 
executing voice application operations from the tagged document to 
generate the information suitable for audio output. 



42. (Currently Amended) A method for voice-based navigation in a server
configured for executing voice web applications comprising: 

receiving a voice-based request to navigate the web from an audio
communication device operable to provide the voice-based request in response 
to a menu generated based on a specific application-defining document operable 
to provide parameters and options; 

associating the voice-based request with the specific application-defining 
document; 

searching for primitive constructs in the voice-based request; 

constructing a text-based request based on the primitive constructs 
identified from the voice-based request; 

generating the text-based request to navigate the web based on the 
primitive constructs in the voice-based request from at least one of a database 
and a proxy server; 

requesting the web page using the text-based web navigation request by 
posting a generated URL to a web server to execute the request for the web 
page; 

receiving the requested web page from the web server; 

accessing a tagged document defined as an XML filtering document page 
from an application document database using the application-defining document 
associated with the voice-based request, the filtering document page employing 
a markup language and operable to filter the retrieved web page to provide 
generated content suitable for audio output, the requesting the web page and 
accessing the filtering document occurring in a substantially concurrent time 
frame; 

generating the filtered web content including subsets of the web page from 
the retrieved web page and the filtering document page indicated by the 
application-defining document associated with the voice-based request; 

generating at least one audio output file based on the filtered web content via a
text-to-speech (TTS) technique operable to convert the text in the filtered
web content to audio output files, generating the audio output file further
comprising:

selecting, based on predetermined expected patterns in the filtering
document, at least one portion of the retrieved web page that is suitable 
for audio output; and 

generating the audio output file based on selecting the at least one 
portion of the first set of information; and 
sending the signals via a network connection to the user audio 
communication device. 

43. (Previously Presented) The method of claim 42 wherein the voice based 
request is operative to identify a particular user via a user identifier number 
indicative of an LDAP resource having personal data and class of data 
information on individual users. 

44. (Previously Presented) The method of claim 42 wherein a web navigation 
application uses a case-match approach to interpret the primitive constructs and 
determine web navigation commands are included in the text-based request. 
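
Editorial illustration (not part of the claim language): the case-match approach of claim 44 can be pictured as matching the first recognized word against a fixed set of cases and building a text-based request from the rest. The sketch below assumes the speech has already been converted to text. The command words `open` and `search`, the host `search.example.com`, and the function `build_text_request` are hypothetical; the claims do not enumerate the primitive constructs.

```python
from typing import Optional
from urllib.parse import quote_plus


def build_text_request(recognized: str) -> Optional[str]:
    """Map recognized speech text to a URL request using case matching.

    Returns None when no navigation command is recognized.
    """
    words = recognized.lower().split()
    if not words:
        return None
    command, argument = words[0], " ".join(words[1:])

    # Each branch is one "case"; the command words are assumptions.
    if command == "open" and argument:
        return "http://" + argument.replace(" ", "") + "/"
    if command == "search" and argument:
        return "http://search.example.com/?q=" + quote_plus(argument)
    return None
```

A table of command handlers would scale better than chained comparisons; the chain is kept here because it mirrors the case-by-case matching the claim names.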

45. (Previously Presented) The method of claim 43 further comprising 
sending the filtered web content in an HTML page to an intermediary proxy 
browser operable to generate signals which the user audio communication 
device converts to audible sound. 



